DEEP LEARNING CA2 PART A: Generative Adversarial Networks (GAN)

Yek Yi Wei
P2107631
DAAA/FT/2B/03

Import Libraries

Background Research

The CIFAR-10 dataset (Canadian Institute For Advanced Research) is a collection of images commonly used to train machine learning and computer vision algorithms, and it is one of the most widely used datasets in machine learning research. It contains 60,000 32x32 colour images in 10 classes: airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks. Each class has 6,000 images.

Check GPU

Metadata

Each image is 32 pixels high and 32 pixels wide with 3 colour channels (RGB), for a total of 3,072 values (32 x 32 x 3). Each value is an integer between 0 and 255. The dataset has 10 classes, which are represented as:

EDA

Loading training dataset

The 10 classes are uniformly distributed, so the dataset is not biased towards any particular class.

Data Preprocessing

Pixel normalization / rescaling for X_train (to bring the pixel values into the range -1 to 1)
Convert the class vector (integers) to a binary class matrix for y_train, to be used in the CGAN (conditional GAN), where each node in the input layer represents a class
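The two preprocessing steps above can be sketched in plain Python. This is only an illustration of the arithmetic, not the notebook's own code: the helper names `scale_pixel` and `to_one_hot` are hypothetical, and the notebook itself most likely applies the same operations to whole arrays at once.

```python
def scale_pixel(value):
    """Map an integer pixel value in [0, 255] to a float in [-1, 1]."""
    return value / 127.5 - 1.0


def to_one_hot(label, num_classes=10):
    """Return a list of length num_classes with 1.0 at index `label`."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


# Hypothetical usage: the darkest, middle and brightest pixel values,
# and the one-hot vector for class index 3.
print([scale_pixel(v) for v in (0, 127.5, 255)])
print(to_one_hot(3))
```

Scaling to [-1, 1] is a common choice when the generator's last layer uses a tanh activation, since the real and generated images then share the same value range.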

Prepare ImageDataGenerator for data augmentation

Use ImageDataGenerator

I decided not to augment the dataset too heavily, as the augmented data distribution can differ considerably from the original one, which affects both the generator and the discriminator in a GAN

Advantages

Adding the Augmented data to the default data

Base GAN Model (Self-Built)

Building Generator Model

The generator is a deep learning model that transforms a random noise vector into a high-dimensional data sample (here, an image) through multiple layers of transformations. It is trained against a discriminator, which tries to distinguish the generated data from the real data.

Model Architecture

Visualising the Generator Model

Building Discriminator Model

The discriminator is a deep learning classifier whose output gives the generator feedback on how to improve the generated data. The generator and discriminator are trained simultaneously in a two-player minimax game: the generator tries to produce realistic data that fools the discriminator, and the discriminator tries to correctly classify the data as real or fake.
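The two-player game can be made concrete with the binary cross-entropy losses that are commonly used for it. The sketch below is a minimal illustration in plain Python, not the notebook's implementation: the function names are hypothetical, real images are assumed to be labelled 1 and fakes 0, and the generator uses the widely used non-saturating loss rather than the literal minimax form.

```python
import math


def binary_cross_entropy(target, prob, eps=1e-7):
    """Cross-entropy between a 0/1 target and a predicted probability."""
    prob = min(max(prob, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """d_real = D(x), d_fake = D(G(z)). Assumes real = 1, fake = 0."""
    return binary_cross_entropy(1.0, d_real) + binary_cross_entropy(0.0, d_fake)


def generator_loss(d_fake):
    """Non-saturating loss: the generator wants D(G(z)) to be close to 1."""
    return binary_cross_entropy(1.0, d_fake)


# Hypothetical discriminator outputs for one real and one fake image.
print(discriminator_loss(d_real=0.9, d_fake=0.1))
print(generator_loss(d_fake=0.1))
```

A confident, correct discriminator drives its own loss down and the generator's loss up, which is the tension the training loop has to balance.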

Model Architecture

Visualising the Discriminator Model

A feature-complete GAN class, overriding compile() to use its own signature, and implementing the entire GAN algorithm in train_step:

A simple custom callback that logs:

Callbacks customize the behavior of a Keras model during training and evaluation

Base GAN Evaluation

Plotting the best generated images at a specific epoch

Fréchet Inception Distance

TensorFlow implementation of the "Fréchet Inception Distance" (FID) between two image distributions, along with a NumPy interface. The FID can be used to evaluate generative models by computing the distance between the real and fake data distributions. A lower FID score indicates that the generated images are of higher quality and more similar to the target data distribution.
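For reference, the standard definition of the FID models each set of Inception features as a multivariate Gaussian and compares the two. The symbols below are notation introduced here for illustration: $\mu_r, \Sigma_r$ are the mean and covariance of the features of the real images, and $\mu_g, \Sigma_g$ those of the generated images.

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\left(\Sigma_r + \Sigma_g - 2\left(\Sigma_r \Sigma_g\right)^{1/2}\right)
$$

The score is 0 when both distributions have identical feature statistics and grows as the means or covariances drift apart.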

Inception Score (IS)

An objective metric for evaluating the quality of generated images, specifically synthetic images output by generative adversarial network models. A higher score indicates images that are both clearly classifiable and diverse across classes.
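The score itself is the exponential of the average KL divergence between each image's predicted class distribution p(y|x) and the marginal distribution p(y) over all generated images. The sketch below shows only that calculation in plain Python. It assumes the class probabilities have already been obtained (in practice from a pretrained Inception v3 classifier), the function name is hypothetical, and the averaging over several splits that is often reported is omitted.

```python
import math


def inception_score(probs):
    """probs: one class-probability list per image, each summing to 1."""
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    total_kl = 0.0
    for p in probs:
        # KL(p(y|x) || p(y)); terms with zero probability contribute 0.
        total_kl += sum(
            pj * (math.log(pj) - math.log(marginal[j]))
            for j, pj in enumerate(p)
            if pj > 0
        )
    return math.exp(total_kl / n)


# Hypothetical predictions for two images over two classes.
print(inception_score([[1.0, 0.0], [0.0, 1.0]]))  # confident and diverse
print(inception_score([[0.5, 0.5], [0.5, 0.5]]))  # uninformative
```

With k classes the score ranges from 1 (every prediction equals the marginal) to k (confident predictions spread evenly over all classes).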

Base Model with Augmented Data

Initialize GANMonitor to save weights to a different folder

CGAN (Conditional GAN)

A CGAN conditions the network on additional information such as class labels. This means that during training we can pass images to the network together with their actual labels.
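One common way to condition the generator is to concatenate the latent vector z with a one-hot encoding of the class label y before the first layer. The sketch below illustrates only that input construction in plain Python. It is an assumption rather than the notebook's architecture (which may, for example, use an embedding layer instead), and the latent dimension of 128 and all function names are hypothetical.

```python
import random


def sample_latent(latent_dim, rng):
    """Draw a latent vector z from a standard normal distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]


def one_hot(label, num_classes=10):
    """One-hot encode an integer class label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def conditional_generator_input(z, label, num_classes=10):
    """Concatenate z with the one-hot condition y."""
    return z + one_hot(label, num_classes)


rng = random.Random(0)            # seeded for reproducibility
z = sample_latent(128, rng)       # latent_dim = 128 is an assumed value
g_input = conditional_generator_input(z, label=3)
print(len(g_input))               # latent_dim + num_classes
```

At sampling time, fixing the label while redrawing z produces different images of the same requested class.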

Conditional Generator

Model Architecture

Visualising the Generator Model

Conditional Discriminator

Model Architecture

Visualising the Discriminator Model

Training

Add a d_xy_tracker and d_g_zy_tracker
Change the training function, as the conditional generator takes 2 inputs (the latent vector z and the condition y, which is the class label)

Initialize GANMonitor to save weights to a different folder

CGAN Evaluation

Initialize the FID function with additional information (class label) for CGAN

Fréchet Inception Distance

TensorFlow implementation of the "Fréchet Inception Distance" (FID) between two image distributions, along with a NumPy interface. The FID can be used to evaluate generative models by computing the distance between the real and fake data distributions. A lower FID score indicates that the generated images are of higher quality and more similar to the target data distribution.

Inception Score (IS)

An objective metric for evaluating the quality of generated images, specifically synthetic images output by generative adversarial network models. A higher score indicates images that are both clearly classifiable and diverse across classes.

CGAN with Data Augmentation

Initialize GANMonitor to save weights to a different folder

CGAN with Augmented Data Evaluation

Conclusion

Further Improvements